Hands-on_Ex3

Published

November 30, 2023

Modified

December 2, 2023

Processing and Visualizing Flow Data

Overview

Spatial interaction represents the flow of people, material, or information between locations in geographical space. It encompasses everything from freight shipments, energy flows, and the global trade in rare antiquities, to flight schedules, rush hour woes, and pedestrian foot traffic.

Each spatial interaction, as an analogy for a set of movements, is composed of a discrete origin/destination pair. Each pair can be represented as a cell in a matrix where rows are related to the locations (centroids) of origin, while columns are related to locations (centroids) of destination. Such a matrix is commonly known as an origin/destination matrix, or spatial interaction matrix.

We will by an OD Matrix by using Passenger Volume by Origin Destination Bus Stops data set downloaded in the previous exercises.

Getting Started

The following package will be loaded in with the p_load() function of pacman: sf, tidyverse, tmap, DT, stplanr.

pacman::p_load(sf, tidyverse, tmap, DT, stplanr)

Preparing the Flow Data

Importing the OD Data

We will import the Passenger Volume by Origin Destination Bus Stops using read_csv().

odbus <- read_csv('data/aspatial/origin_destination_bus_202310.csv')

glimpse() can be used to display the data type and some rows of the odbus data frame.

glimpse(odbus)
Rows: 5,694,297
Columns: 7
$ YEAR_MONTH          <chr> "2023-10", "2023-10", "2023-10", "2023-10", "2023-…
$ DAY_TYPE            <chr> "WEEKENDS/HOLIDAY", "WEEKDAY", "WEEKENDS/HOLIDAY",…
$ TIME_PER_HOUR       <dbl> 16, 16, 14, 14, 17, 17, 17, 7, 14, 14, 10, 20, 20,…
$ PT_TYPE             <chr> "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "…
$ ORIGIN_PT_CODE      <chr> "04168", "04168", "80119", "80119", "44069", "2028…
$ DESTINATION_PT_CODE <chr> "10051", "10051", "90079", "90079", "17229", "2014…
$ TOTAL_TRIPS         <dbl> 3, 5, 3, 5, 4, 1, 24, 2, 1, 7, 3, 2, 5, 1, 1, 1, 1…

ORIGIN_PT_CODE and DESTINATION_PT_CODE are in character data type whereas they should be converted into factor data type for easier manipulation.

odbus$ORIGIN_PT_CODE <- as.factor(odbus$ORIGIN_PT_CODE)

odbus$DESTINATION_PT_CODE <- as.factor(odbus$DESTINATION_PT_CODE)

Extracting the study data

We will only look at commuting flows on weekday between 6 and 9 o’clock. This can be extracted using the filter() function.

odbus6_9 <- odbus %>%
  filter(DAY_TYPE == 'WEEKDAY') %>%
  filter(TIME_PER_HOUR >= 6 &
           TIME_PER_HOUR <= 9) %>%
  group_by(ORIGIN_PT_CODE,
           DESTINATION_PT_CODE) %>%
  summarise(TRIPS = sum(TOTAL_TRIPS))

datatable() can be used to view odbus6_9

datatable(odbus6_9)

We will save the output in rds format for future use.

write_rds(odbus6_9, 'data/rds/odbus6_9.rds')

read_rds() can be used to import the rds file into R.

odbus6_9 <- read_rds('data/rds/odbus6_9.rds')

Working with Geospatial Data

Two geospatial data sets will be used:

  • BusStop: This data provides the location of bus stops as at July 2023

  • MPSZ-2019: this data provides the sub-zone boundary of URA Master Plan 2019.

Both data sets are in ESRI shapefile format.

Importing Geospatial Data

busstop <- st_read(dsn = 'data/geospatial',
                   layer = 'BusStop') %>%
  st_transform(crs = 3414)
Reading layer `BusStop' from data source 
  `D:\phlong2023\ISSS624\Hands-on_Ex\Hands-on_Ex3\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
mpsz <- st_read(dsn = 'data/geospatial',
                layer = 'MPSZ-2019') %>%
  st_transform(crs = 3414)
Reading layer `MPSZ-2019' from data source 
  `D:\phlong2023\ISSS624\Hands-on_Ex\Hands-on_Ex3\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS:  WGS 84
mpsz
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21 / Singapore TM
First 10 features:
                 SUBZONE_N SUBZONE_C       PLN_AREA_N PLN_AREA_C       REGION_N
1              MARINA EAST    MESZ01      MARINA EAST         ME CENTRAL REGION
2         INSTITUTION HILL    RVSZ05     RIVER VALLEY         RV CENTRAL REGION
3           ROBERTSON QUAY    SRSZ01  SINGAPORE RIVER         SR CENTRAL REGION
4  JURONG ISLAND AND BUKOM    WISZ01  WESTERN ISLANDS         WI    WEST REGION
5             FORT CANNING    MUSZ02           MUSEUM         MU CENTRAL REGION
6         MARINA EAST (MP)    MPSZ05    MARINE PARADE         MP CENTRAL REGION
7                   SUDONG    WISZ03  WESTERN ISLANDS         WI    WEST REGION
8                  SEMAKAU    WISZ02  WESTERN ISLANDS         WI    WEST REGION
9           SOUTHERN GROUP    SISZ02 SOUTHERN ISLANDS         SI CENTRAL REGION
10                 SENTOSA    SISZ01 SOUTHERN ISLANDS         SI CENTRAL REGION
   REGION_C                       geometry
1        CR MULTIPOLYGON (((33222.98 29...
2        CR MULTIPOLYGON (((28481.45 30...
3        CR MULTIPOLYGON (((28087.34 30...
4        WR MULTIPOLYGON (((14557.7 304...
5        CR MULTIPOLYGON (((29542.53 31...
6        CR MULTIPOLYGON (((35279.55 30...
7        WR MULTIPOLYGON (((15772.59 21...
8        WR MULTIPOLYGON (((19843.41 21...
9        CR MULTIPOLYGON (((30870.53 22...
10       CR MULTIPOLYGON (((26879.04 26...

We can write the mpsz sf tibble data frame into an rds file for future use.

write_rds(mpsz, "data/rds/mpsz.rds")

Geospatial Data Wrangling

Combining BusStop and mpsz

We can populate the planning subzone code (SUBZONE_C) of mpsz sf data frame into busstop sf data frame

busstop_mpsz <- st_intersection(busstop, mpsz) %>%
  select(BUS_STOP_N, SUBZONE_C) %>%
  st_drop_geometry()
datatable(busstop_mpsz)

Before moving to the next step, it is wise to save the output into rds format.

write_rds(busstop_mpsz, 'data/rds/busstop_mpsz.rds')

We are going to append the planning subzone code from busstop_mpsz data frame onto odbus6_9 data frame.

od_data <- left_join(odbus6_9, busstop_mpsz, by = c('ORIGIN_PT_CODE' = 'BUS_STOP_N')) %>%
  rename(ORGIN_BS = ORIGIN_PT_CODE,
         ORIGIN_SZ = SUBZONE_C,
         DESTIN_BS = DESTINATION_PT_CODE)

Before continuing, it is a good practice for us to check for duplicating records

duplicate <- od_data %>%
  group_by_all() %>%
  filter(n()>1) %>%
  ungroup

If duplicated records are found, we can remove them using unique().

od_data <- unique(od_data)

We can reconfirm whether the duplicated records have been removed.

duplicate <- od_data %>%
  group_by_all() %>%
  filter(n()>1) %>%
  ungroup()

We can confirm that there is no more duplicate records.

Now, we can update od_data with the planning subzone codes for destination subzone.

od_data <- left_join(od_data, busstop_mpsz, by = c('DESTIN_BS'= 'BUS_STOP_N'))

Next, we can check for duplicate record and proceed to remove them

duplicate <- od_data %>%
  group_by_all() %>%
  filter(n()>1) %>%
  ungroup()

od_data <- unique(od_data)

Next, we can rename the new subzone column and summarise the total number of trips by each origin and destination subzones.

od_data <- od_data %>%
  rename(DESTIN_SZ = SUBZONE_C) %>%
  drop_na() %>%
  group_by(ORIGIN_SZ, DESTIN_SZ) %>%
  summarise(MORNING_PEAK = sum(TRIPS))

Now, we can save the output into an rds file format.

write_rds(od_data, 'data/rds/od_data.rds')
od_data <- read_rds('data/rds/od_data.rds')

Visualizing Spatial Interaction

We can prepare a desire line by using the stplanr package.

Removing Intra-zonal Flows

We will not plot the intra-zonal flows.

od_data1 <- od_data[od_data$ORIGIN_SZ != od_data$DESTIN_SZ,]

Creating Desire Lines

od2line() is used to create the desire lines.

flowLine <- od2line(flow = od_data1, 
                    zones = mpsz,
                    zone_code = 'SUBZONE_C')

Visualizing the Desire Lines

We can use the tmap package to visualize the desire line.

tmap_mode('plot')
tm_shape(mpsz)+
  tm_polygons() +
  flowLine %>%
  tm_shape()+
  tm_lines(lwd = 'MORNING_PEAK',
           style = 'quantile',
           scale = c(0.1, 1, 3, 5, 7, 10),
           n = 6,
           alpha = 0.3)

When the flow data are very messy, and highly skewed like the one shown above, it is wiser to focus on selected flows. For example, flow greater than or equal to 5000 as shown below.

tm_shape(mpsz)+
  tm_polygons()+
  flowLine %>%
  filter(MORNING_PEAK >= 5000) %>%
  tm_shape()+
  tm_lines(lwd = 'MORNING_PEAK',
           style = 'quantile',
           scale = c(0.1, 1, 3, 5, 7, 10),
           n = 6,
           alpha = 0.3)